klotz: large language models


  1. A collection of lightweight AI-powered tools built with LLaMA.cpp and small language models.
  2. This paper explores the structure of the feature point cloud discovered by sparse autoencoders in large language models. It investigates three scales: atomic, brain, and galaxy. The atomic scale involves crystal structures with parallelograms or trapezoids, improved by projecting out distractor dimensions. The brain scale focuses on modular structures, similar to neural lobes. The galaxy scale examines the overall shape and clustering of the point cloud.
    2024-11-06 by klotz
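The "projecting out distractor dimensions" step in item 2 can be sketched with plain linear algebra: subtract each feature vector's component inside the span of the distractor directions, which sharpens parallelogram/trapezoid structure in the remaining coordinates. This is a minimal illustration, not the paper's code; the toy point cloud and the single distractor direction are assumptions.

```python
import numpy as np

def project_out(points, distractors):
    """Remove the span of the distractor directions from each feature vector."""
    # Orthonormalize the distractor directions (columns of Q span the subspace)
    Q, _ = np.linalg.qr(np.asarray(distractors, dtype=float).T)
    # Subtract each point's component lying inside the distractor subspace
    return points - (points @ Q) @ Q.T

# Toy example: a planar parallelogram perturbed along one distractor axis
rng = np.random.default_rng(0)
base = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)
distractor = np.array([[0.0, 0.0, 1.0]])          # assumed distractor direction
tilted = base + rng.normal(size=(4, 1)) * distractor
cleaned = project_out(tilted, distractor)          # z-component removed
```

After projection the points lie back in the plane, so the parallelogram geometry is recoverable from the first two coordinates.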
  3. The article discusses the emerging role of AI agents as distinct users, requiring designers to adapt their practices to account for the needs and capabilities of these intelligent systems.

    - Agents are becoming active users in systems, requiring designers to extend UX principles to cover both humans and AI agents.
    - The future of UX lies in understanding and designing for Agent-Computer Interaction.
    2024-11-06 by klotz
  4. Replace traditional NLP approaches with prompt engineering and Large Language Models (LLMs) for Jira ticket text classification. A code sample walkthrough.
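The prompt-engineering approach in item 4 replaces a trained classifier with a zero-shot prompt sent to an LLM. A minimal sketch of the prompt-construction step, assuming a hypothetical label set (the article's actual labels and model calls are not shown here):

```python
LABELS = ["bug", "feature-request", "question"]  # hypothetical label set

def build_prompt(ticket_text, labels=LABELS):
    """Build a zero-shot classification prompt for a Jira ticket."""
    label_list = ", ".join(labels)
    return (
        "Classify the following Jira ticket into exactly one of these "
        f"categories: {label_list}.\n"
        "Answer with the category name only.\n\n"
        f"Ticket: {ticket_text}"
    )

prompt = build_prompt("Login page throws a 500 error after the last deploy.")
```

The returned string would then be sent to whatever LLM endpoint is in use; constraining the answer to the category name keeps the response trivially parseable.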
  5. A comparison of frameworks, models, and costs for deploying Llama models locally and privately.

    - Four tools were analyzed: HuggingFace, vLLM, Ollama, and llama.cpp.
    - HuggingFace has a wide range of models but struggles with quantized models.
    - vLLM is experimental and lacks full support for quantized models.
    - Ollama is user-friendly but has some customization limitations.
    - llama.cpp is preferred for its performance and customization options.
    - The analysis focused on llama.cpp and Ollama, comparing speed and power consumption across different quantizations.
    2024-11-03 by klotz
  6. All Hands AI has released OpenHands CodeAct 2.1, an open-source software development agent that can solve over 50% of real GitHub issues in SWE-Bench. The agent uses Anthropic’s Claude-3.5 model, function calling, and improved directory traversal to achieve this milestone.
    2024-11-02 by klotz
  7. Visa is leveraging artificial intelligence across numerous aspects of its operations, with no plans to slow down its implementation.
    2024-11-02 by klotz
  8. Docling is a tool that parses documents and exports them to desired formats like Markdown and JSON. It supports various document formats including PDF, DOCX, PPTX, Images, HTML, AsciiDoc, and Markdown.
    2024-11-01 by klotz
  9. The post discusses the feasibility of fine-tuning an encoder-decoder model to translate Egyptian Middle Kingdom hieroglyphics into English. The author suggests that with sufficient training data and a tokenizer that includes Egyptian characters, the model could learn to interpret hieroglyphics fluently. Commenters mention plugins and knowledge already present in existing models as alternatives to fine-tuning.
  10. This article summarizes various techniques and goals of language model finetuning, including knowledge injection and alignment, and discusses the effectiveness of different approaches such as instruction tuning and supervised fine-tuning.



About - Propulsed by SemanticScuttle